Exploiting Speech Recognition Transcripts for Narrative Peak Detection in Short-Form Documentaries
نویسندگان
چکیده
Narrative peaks are points at which the viewer perceives a spike in the level of dramatic tension within the narrative flow of a video. This paper reports on four approaches to narrative peak detection in television documentaries that were developed by a joint team consisting of members from Delft University of Technology and the University of Twente within the framework of the VideoCLEF 2009 Affect Detection task. The approaches make use of speech recognition transcripts and seek to exploit various sources of evidence in order to automatically identify narrative peaks. These sources include speaker style (word choice), stylistic devices (use of repetitions), strategies strengthening viewers’ feelings of involvement (direct audience address) and emotional speech. These approaches are compared to a challenging baseline that predicts the presence of narrative peaks at fixed points in the video, presumed to be dictated by natural narrative rhythm or production convention. Two approaches are tied in delivering top narrative peak detection results. One uses counts of first and second person pronouns to identify points in the video where viewers feel most directly involved. The other uses affective word ratings to calculate scores reflecting emotional language.
منابع مشابه
Narrative Peak Detection in Short-Form Documentaries Using Speech Recognition Transcripts
Narrative peaks are points at which the viewer perceives a spike in the level of dramatic tension within the narrative flow of a video. In this paper we describe two approaches for automatic identification of narrative peaks in short-form documentaries, within the framework of the VideoCLEF 2009 Affect Detection task. Both approaches exploit the speech recognition transcript in order to identif...
متن کاملOverview of VideoCLEF 2009: New Perspectives on Speech-based Multimedia Content Enrichment
VideoCLEF 2009 offered three tasks related to enriching video content for improved multimedia access in a multilingual environment. For each task, video data (Dutch-language television, predominantly documentaries) accompanied by speech recognition transcripts were provided. The Subject Classification Task involved automatic tagging of videos with subject theme labels. The best performance was ...
متن کاملSpoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB) as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIBchr('39')s archive is one of the richest archives in Iran containing a huge amount of multimedia data. Monitoring this massive volume of data, and brows and retrieval of this archive is one of the key issues for this broadcasting...
متن کاملAutomatic speech recognition in the diagnosis of primary progressive aphasia
Narrative speech can provide a valuable source of information about an individual’s linguistic abilities across lexical, syntactic, and pragmatic levels. However, analysis of narrative speech is typically done by hand, and is therefore extremely time-consuming. Use of automatic speech recognition (ASR) software could make this type of analysis more efficient and widely available. In this paper,...
متن کاملClassification of Dual Language Audio-Visual Content: Introduction to the VideoCLEF 2008 Pilot Benchmark Evaluation Task
VideoCLEF is a new track for the CLEF 2008 campaign. This track aims to develop and evaluate tasks in analyzing multilingual video content. A pilot of a Vid2RSS task involving assigning thematic class labels to video kicks off the VideoCLEF track in 2008. Task participants deliver classification results in the form of a series of feeds, one for each thematic class. The data for the task are dua...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009